In this document, we benchmark the runtime performance of simona on ontologies of various sizes (i.e., on the order of 1K, 10K, 100K and 1M terms).
library(simona)
set.seed(123)
We write a small function which applies the "Sim_WP_1994" similarity method. "Sim_WP_1994" is based on the DAG structure: it uses the longest distance from the root to the lowest common ancestor (LCA) of the two terms (i.e., the depth of the LCA) and the longest distances from the LCA to the two terms.
Denote two terms as \(a\) and \(b\) and their LCA term as \(c\); \(\delta(c)\) is the depth of \(c\) in the DAG, i.e. the longest distance from the root term, and \(\mathrm{len}(c, a)\) is the longest distance from \(c\) to \(a\). The "Sim_WP_1994" similarity is calculated as:
\[ \mathrm{Sim}(a, b) = \frac{2\delta(c)}{\mathrm{len}(c, a) + \mathrm{len}(c, b) + 2\delta(c)} \]
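As a quick numeric check of the formula, with assumed toy values (`sim_wp()` is a hypothetical helper for illustration, not a simona function): if \(\delta(c) = 3\), \(\mathrm{len}(c, a) = 1\) and \(\mathrm{len}(c, b) = 2\), the similarity is \(6/9 \approx 0.67\).

```r
# Worked example of the Sim_WP_1994 formula with assumed toy values;
# sim_wp() is a hypothetical helper, not part of the simona API.
sim_wp = function(delta_c, len_ca, len_cb) {
  2 * delta_c / (len_ca + len_cb + 2 * delta_c)
}
sim_wp(delta_c = 3, len_ca = 1, len_cb = 2)  # 2*3/(1 + 2 + 2*3) = 6/9
```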
benchmark_runtime = function(dag, by = 200, max = 10000) {
invisible(dag_depth(dag)) # depth will be cached
n_terms = dag_n_terms(dag)
k = seq(by, min(max, floor(n_terms/by)*by), by = by)
t = rep(NA_real_, length(k))
for(i in seq_along(k)) {
message(k[i], "/", max(k), "...")
terms = sample(n_terms, k[i]) # numeric indices are also allowed
t[i] = system.time(term_sim(dag, terms, method = "Sim_WP_1994"))[3]
}
data.frame(k = k, t = t)
}
In the common settings, when we import the ontology datasets, we set remove_rings = TRUE to remove rings and remove_cyclic_paths = TRUE to remove cyclic paths.
The pw.owl file is downloaded from http://obofoundry.org/ontology/pw.html. It contains several thousand terms.
dag = import_owl("~/workspace/ontology/OBOFoundry/pw/pw.owl",
remove_rings = TRUE, remove_cyclic_paths = TRUE)
dag
## An ontology_DAG object:
## Source: http://purl.obolibrary.org/obo/pw.owl, http://purl.obolibrary.org/obo/pw/7.52/pw.owl
## 2561 terms / 3142 relations
## Root: ~~all~~
## Terms: PW:0000001, PW:0000002, PW:0000003, PW:0000004, ...
## Max depth: 10
## Avg number of parents: 1.23
## Aspect ratio: 78.11:1 (based on the longest distance from root)
## 92:1 (based on the shortest distance from root)
## Relations: is_a
##
## With the following columns in the metadata data frame:
## id, short_id, name, namespace, definition
In the plot, a smooth line from a loess fit is added.
df = benchmark_runtime(dag, by = 100)
plot(df$k, df$t, xlab = "Numbers of random terms", ylab = "runtime (sec)",
main = paste0("Pathway Ontology, ", dag_n_terms(dag), " terms"))
x = c(0, df$k)
y = c(0, df$t)
fit = loess(y ~ x, span = 0.5)
lines(x, predict(fit))
Figure S2-1. Runtime performance of simona on the Pathway Ontology.
The ontology is taken directly from the GO.db package. We use the Biological Process (BP) namespace. To be consistent with the other ontologies under test, we only take the "is_a" relation type. It contains several tens of thousands of terms.
dag = create_ontology_DAG_from_GO_db(relations = NULL)
dag
## An ontology_DAG object:
## Source: GO BP / GO.db package 3.17.0
## 27942 terms / 50938 relations
## Root: GO:0008150
## Terms: GO:0000001, GO:0000002, GO:0000003, GO:0000011, ...
## Max depth: 16
## Avg number of parents: 1.82
## Aspect ratio: 385.17:1 (based on the longest distance from root)
## 751.33:1 (based on the shortest distance from root)
## Relations: is_a
##
## With the following columns in the metadata data frame:
## id, name, definition
df = benchmark_runtime(dag, by = 400, max = 20000)
plot(df$k, df$t, xlab = "Numbers of random terms", ylab = "runtime (sec)",
main = paste0("Gene Ontology (BP), ", dag_n_terms(dag), " terms"))
x = c(0, df$k)
y = c(0, df$t)
fit = loess(y ~ x, span = 0.5)
lines(x, predict(fit))
Figure S2-2. Runtime performance of simona on the Gene Ontology, biological process namespace.
The chebi.obo file is downloaded from http://obofoundry.org/ontology/chebi.html. It contains several hundred thousand terms.
dag = import_obo("~/workspace/ontology/OBOFoundry/chebi/chebi.obo",
remove_rings = TRUE, remove_cyclic_paths = TRUE)
dag
df = benchmark_runtime(dag, by = 400, max = 20000)
plot(df$k, df$t, xlab = "Numbers of random terms", ylab = "runtime (sec)",
main = paste0("Chemical Entities of Biological Interest, ", dag_n_terms(dag), " terms"))
x = c(0, df$k)
y = c(0, df$t)
fit = loess(y ~ x, span = 0.5)
lines(x, predict(fit))
Figure S2-3. Runtime performance of simona on the Chemical Entities of Biological Interest.
The ncbitaxon.owl file is downloaded from http://obofoundry.org/ontology/ncbitaxon.html. It contains several million terms.
dag = import_owl("~/workspace/ontology/OBOFoundry/ncbitaxon/ncbitaxon.owl",
remove_rings = TRUE, remove_cyclic_paths = TRUE)
dag
df = benchmark_runtime(dag, by = 400, max = 20000)
plot(df$k, df$t, xlab = "Numbers of random terms", ylab = "runtime (sec)",
main = paste0("NCBI organismal classification, ", dag_n_terms(dag), " terms"))
x = c(0, df$k)
y = c(0, df$t)
fit = loess(y ~ x, span = 0.5)
lines(x, predict(fit))
Figure S2-4. Runtime performance of simona on the NCBI organismal classification.
We benchmark all OBO Foundry ontologies with more than 1000 terms. We test 50 different numbers of random terms, from 100 to min(20000, dag_n_terms(dag)). The result has already been generated and saved in "runtime_OBOFoundry_all.RData". The script that generates this file is run_time_OBOFoundry.R. The plots for individual ontologies can be found in the “OBO Foundry gallery” document.
There are two objects in "runtime_OBOFoundry_all.RData":
- lt: a list of two-column data frames which contain the numbers of random terms tested and the corresponding runtime.
- df: a data frame that contains meta-information of each ontology.

load("runtime_OBOFoundry_all.RData")
length(lt)
## [1] 99
head(names(lt))
## [1] "ado" "agro" "aism" "apollo_sv" "aro" "bcgo"
head(lt[[1]])
## k t
## 1 100 0.009
## 2 137 0.009
## 3 175 0.013
## 4 213 0.041
## 5 251 0.030
## 6 289 0.043
head(df)
## file
## ado /Users/guz/workspace/ontology/OBOFoundry/ado/ado.owl
## agro /Users/guz/workspace/ontology/OBOFoundry/agro/agro.owl
## aism /Users/guz/workspace/ontology/OBOFoundry/aism/aism.obo
## apollo_sv /Users/guz/workspace/ontology/OBOFoundry/apollo_sv/apollo_sv.owl
## aro /Users/guz/workspace/ontology/OBOFoundry/aro/aro.owl
## bcgo /Users/guz/workspace/ontology/OBOFoundry/bcgo/bcgo.owl
## id basename type method n_terms n_relations
## ado ado ado.owl owl import_owl 1960 2340
## agro agro agro.owl owl import_owl 4118 4965
## aism aism aism.obo obo import_obo 6608 10296
## apollo_sv apollo_sv apollo_sv.owl owl import_owl 1608 1613
## aro aro aro.owl owl import_owl 6897 7003
## bcgo bcgo bcgo.owl owl import_owl 2268 2856
## avg_parents avg_children asp1 asp2 depth_max depth_q99
## ado 1.194487 4.804928 39.89 72.42 30 18
## agro 1.205975 2.602201 21.77 24.80 36 31
## aism 1.558347 3.031802 37.00 41.41 41 29
## apollo_sv 1.003734 2.320863 8.64 8.67 37 33
## aro 1.015516 8.603194 385.75 378.62 8 8
## bcgo 1.259815 3.352113 19.06 40.46 19 17
In this section, we benchmark the runtime performance of simona from two aspects: the number of query terms, and the size of the ontology.
Different ontologies have different ranges of runtime. To make them comparable, we scale values on x-axis (i.e. numbers of terms) and values on y-axis (runtime) both into the range of [0, 1]. In the next plot, we also add lines for the linear, quadratic and cubic time complexities.
plot(NULL, xlim = c(0, 1), ylim = c(0, 1),
xlab = "Numbers of random terms, scaled", ylab = "runtime, scaled",
main = "Compare runtime performance of OBO Foundry ontologies")
for(i in seq_along(lt)) {
x = lt[[i]]$k
y = lt[[i]]$t
x = x/max(x)
y = y/max(y)
lines(x, y, col = "#00000080")
}
curve(x^1, from = 0, to = 1, col = 2, lty = 2, add = TRUE)
curve(x^2, from = 0, to = 1, col = 3, lty = 2, add = TRUE)
curve(x^3, from = 0, to = 1, col = 4, lty = 2, add = TRUE)
legend("topleft", lty = 2, col = 2:4, legend = c("O(n)", "O(n^2)", "O(n^3)"))
Figure S2-5. Compare runtime performance on various ontologies. Runtime performance is scaled into [0, 1] in the plot.
In the plot, if a curve bends more toward the bottom right of the plotting region, the time complexity is worse. The plot shows that for most ontologies, simona has a non-linear time complexity for calculating similarities of \(n\) query terms, close to \(O(n^2)\), but for a few ontologies simona shows a nearly linear time complexity.
Since the previous plot scales the values on both the x-axis and the y-axis into [0, 1], we can measure the difference from linear time complexity as the difference between the area under the curve and the area under y = x (the diagonal, i.e. the red line). If a curve bends more toward the bottom right of the plotting region, it has worse time complexity. The following function rel_diff() calculates this relative time-complexity difference. The third argument max_x sets the maximal number of random terms taken from the benchmark results, and will be used later.
rel_diff = function(x, y, max_x = NULL) {
if(missing(y)) {
y = x[[2]]
x = x[[1]]
}
if(!is.null(max_x)) {
l = x <= max_x
x = x[l]
y = y[l]
}
od = order(x)
x = x[od]
y = y[od]
n = length(x)
x = x/max(x)
y = y/max(y)
area = sum( 0.5*(x[2:n] - x[1:(n-1)])*(y[2:n] + y[1:(n-1)]) ) # trapezoidal rule
0.5 - area
}
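As a sanity check of this measure on ideal curves, we can use an inline trapezoid helper equivalent to the integration step in rel_diff(): a linear curve gives 0, a quadratic curve about 1/6, and a cubic curve about 1/4, matching the reference lines used later.

```r
# Trapezoidal area under a curve sampled at points (x, y), both in [0, 1]
trap_area = function(x, y) {
  n = length(x)
  sum(0.5 * (x[2:n] - x[1:(n-1)]) * (y[2:n] + y[1:(n-1)]))
}
x = seq(0, 1, length.out = 1001)
0.5 - trap_area(x, x)    # linear:    0
0.5 - trap_area(x, x^2)  # quadratic: ~1/6
0.5 - trap_area(x, x^3)  # cubic:     ~1/4
```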
We find that the time complexity on different ontologies is quite stable around \(O(n^2)\), but for some large ontologies the time complexity drops close to linear.
df$rel_diff = sapply(lt, rel_diff)
plot(df$n_terms, df$rel_diff, log = "x",
xlab = "Number of terms",
ylab = "Relative difference to linear time complexity")
abline(h = 0, lty = 2, col = 2)
abline(h = 1/6, lty = 2, col = 3) # 1/2 - 1/3; area under the quadratic curve is 1/3
abline(h = 1/4, lty = 2, col = 4) # 1/2 - 1/4; area under the cubic curve is 1/4
l = df$n_terms > 100000
text(df$n_terms[l], df$rel_diff[l], df$id[l],
adj = c(1, -0.4), col = "blue", cex = 0.8)
legend("topright", lty = 2, col = 2:4, legend = c("O(n)", "O(n^2)", "O(n^3)"))
Figure S2-6. Relative time complexity to linear complexity. Ontologies with more than 100k terms are highlighted by their names.
However, this does not mean large ontologies have better time complexity than smaller ones. In the benchmark procedure, the maximal number of random terms to pick (let’s call it max_pick) is set to 20000. For large ontologies whose number of terms is far larger than 20000, the picked terms are only a tiny fraction of all terms, so the close-to-quadratic complexity is not clearly observable. In other words, when the number of picked terms is far smaller than the total number of terms in the ontology, the time complexity is close to linear.
We can validate this by setting max_pick to smaller values such as 5000 and 1000, as shown in the following plots. When max_pick becomes smaller, this reduction toward linear complexity is observed in more ontologies.
Figure S2-7. Compare the effect of max_pick on the time complexity.
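The mechanism can be mimicked with a purely synthetic cost model (illustrative, not measured data): assume the runtime is a linear term plus a weak quadratic term that only dominates at large \(k\). Truncating at a smaller max_pick then yields a smaller relative difference, i.e. a more linear-looking curve. rd() below is a compact inline equivalent of rel_diff() with the max_x truncation.

```r
# Compact equivalent of rel_diff() with truncation at max_x
rd = function(x, y, max_x) {
  l = x <= max_x
  x = x[l]/max(x[l])
  y = y[l]/max(y[l])
  n = length(x)
  0.5 - sum(0.5 * (x[2:n] - x[1:(n-1)]) * (y[2:n] + y[1:(n-1)]))
}
k = seq(100, 20000, by = 100)
t = 1e-4 * k + 5e-9 * k^2  # assumed cost model: linear + weak quadratic term
rd(k, t, 20000)  # clearly non-linear
rd(k, t, 5000)
rd(k, t, 1000)   # nearly linear
```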
In the previous comparisons, we fixed the ontology and checked the runtime when increasing the number of terms used for calculating similarities. Next we look at the other dimension: we fix the number of random terms and check how the runtime depends on the size of the ontology.
We define the following get_t_by_k() function. For each ontology, the runtime \(t\) for a given number of terms \(k\) is taken directly from the benchmark table if available, otherwise it is predicted by a loess fit.
get_t_by_k = function(k) {
sapply(lt, function(df) {
i0 = which(df$k == k)
if(length(i0)) {
df$t[i0]
} else {
if(k < min(df$k) || k > max(df$k)) {
return(NA) # k is outside the range of the benchmarked values
}
x = c(0, df$k)
y = c(0, df$t)
fit = loess(y ~ x, span = 0.5)
predict(fit, k)
}
})
}
For example, we can estimate the runtime on ontologies of different sizes when randomly sampling 500 terms for calculating similarities:
t500 = get_t_by_k(500)
plot(df$n_terms, t500, log = "xy",
xlab = "Size of the ontology", ylab = "Runtime (sec)",
main = "Randomly sample 500 terms from each ontology")
Figure S2-8. Runtime performance on the size of ontologies. The number of terms for calculating semantic similarities is fixed to 500.
In a double-log coordinate system, the runtime has a linear relation to the size of the ontology. We can perform a simple linear regression:
lm(log(t500) ~ log(df$n_terms))
##
## Call:
## lm(formula = log(t500) ~ log(df$n_terms))
##
## Coefficients:
## (Intercept) log(df$n_terms)
## -9.913 0.771
Let’s denote the runtime as \(t\) and the size of the ontology as \(N\); then they have a linear relation: \(\log(t) = 0.77\log(N) - 9.91\). The slope is less than one, which means that if the ontology size increases by 2x, the runtime only increases by a factor of \(2^{0.77} \approx 1.71\); thus the time complexity at \(k = 500\) is \(O(N^{0.77})\).
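We can check directly what the fitted model implies, using the coefficients from the regression output above:

```r
# Runtime predicted by the fitted power law: t = exp(-9.913) * N^0.771
t_pred = function(N) exp(-9.913 + 0.771 * log(N))
t_pred(20000) / t_pred(10000)  # doubling N multiplies the runtime by 2^0.771
```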
We can measure the relation between runtime \(t\) and ontology size \(N\) for a list of different \(k\). The next function compare_runtime_by_k() calculates the runtime, makes the plot and performs linear regression for a given \(k\). It returns the slope from the linear regression model.
library(GetoptLong)
compare_runtime_by_k = function(k) {
tt = get_t_by_k(k)
plot(df$n_terms, tt, log = "xy",
xlab = "N / Size of the ontology", ylab = "t / Runtime (sec)",
main = qq("Randomly sample @{k} terms from each ontology"))
fit = lm(log(tt) ~ log(df$n_terms))
coef = round(coef(fit), 2)
coef[1] = ifelse(sign(coef[1]) == 1, paste0("+", coef[1]), coef[1])
text(min(df$n_terms), max(tt, na.rm = TRUE),
qq("log(t) = @{coef[2]}*log(N)@{coef[1]}"), adj = c(0, 1))
coef(fit)[2]
}
The list of \(k\) we will compare is defined as follows.
k = c(500, 600, 700, 800, 900, 1000, 2000, 3000,
4000, 5000, 6000, 7000, 8000, 9000, 10000)
We make the plot for every \(k\):
coef = sapply(k, compare_runtime_by_k)
Figure S2-9. Runtime performance on the size of ontologies, on a list of different k.
The slope coefficient from the linear regression model measures the growth rate of the runtime when increasing the size of the ontology. We can check how fast the runtime increases when randomly sampling different numbers of terms.
plot(k, coef, xlab = "Numbers of random terms", ylab = "coef",
main = "Runtime complexity on different k")
Figure S2-10. The increasing rate of runtime when increasing the ontology size, at different k.
It shows that when the number of query terms increases, the time complexity stabilizes around \(O(N^{0.5})\).
First we describe the data structures:
- All terms are stored as integer indices c(1, 2, 3, ..., n_all).
- lt_parents and lt_children: lt_parents[[i]] is a vector of the integer indices of term i's parent terms. E.g., if lt_parents[[6]] = c(3, 4, 5), terms 3, 4 and 5 are parents of term 6. To see term 3's parents, go directly to lt_parents[[3]]. The structure is the same for lt_children, where lt_children[[i]] is a vector of the integer indices of term i's child terms. Both lt_parents and lt_children have length n_all. If lt_parents[[i]] is empty, term i is the root; if lt_children[[i]] is empty, term i is a leaf term.
- l_*: logical vectors with length n_all. E.g., a vector l_all_ancestors stores whether term i is an ancestor term. To simplify the description of the algorithm, we treat the l_* vectors as pointers, where a value change in place is visible everywhere the l_* variables are used.

As an example, we describe the algorithm for looking for LCA terms. The algorithm contains two steps.
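A minimal toy instance of these structures, using six hypothetical terms that match the lt_parents[[6]] = c(3, 4, 5) example above:

```r
# A toy DAG with 6 hypothetical terms; term 1 is the root and
# terms 3, 4, 5 are the parents of term 6
n_all = 6
lt_parents  = list(integer(0), 1L, 1L, 2L, 2L, c(3L, 4L, 5L))
lt_children = list(c(2L, 3L), c(4L, 5L), 6L, 6L, 6L, integer(0))
which(lengths(lt_parents) == 0)   # empty parent list: the root, term 1
which(lengths(lt_children) == 0)  # empty child list: the leaf, term 6
```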
Input: a vector of terms, already converted to integer indices. Output: a matrix of the depths of the LCA terms of all term pairs.
Step 1: find all ancestors of the given group of terms, including the terms themselves; this is the union of the ancestors of the individual terms. We need to define three functions in this step.
# nodes: an integer vector
ancestors_of_a_group = function(nodes) {
l_all_ancestors = rep(FALSE, n_all)
for(node in nodes) {
l_all_ancestors[node] = TRUE # include the nodes themselves
# it is the same as using `add_parents()`, but we want to make it clear
# that this is the union of ancestors of individual nodes
add_ancestors(node, l_all_ancestors)
}
return(l_all_ancestors)
}
# node: a single integer scalar
# l_all_ancestors: a logical vector with length n_all (let's treat it as a pointer)
add_ancestors = function(node, l_all_ancestors) {
add_parents(node, l_all_ancestors)
}
# node: a single integer scalar
# l_all_ancestors: a logical vector with length n_all (let's treat it as a pointer)
add_parents = function(node, l_all_ancestors) {
parents = lt_parents[[node]] # a vector of integer indices
if(length(parents) > 0) {
l_all_ancestors[parents] = TRUE
for(p in parents) {
add_parents(p, l_all_ancestors) # recurse on each parent
}
}
}
l_all_ancestors = ancestors_of_a_group(nodes)
all_ancestors = which(l_all_ancestors)
Looking for ancestors is applied recursively by add_parents(). Note again, we treat the logical vector as a pointer so that it can be updated in the recursive parent lookup.
Step 1 generates two variables: l_all_ancestors, a logical vector representing whether term i is an ancestor term, and all_ancestors, the integer indices of the TRUE values in l_all_ancestors.
Step 2: all common ancestors of nodes are contained in the set all_ancestors. We look for the LCA in a top-down manner.
For each ancestor term an in all_ancestors, we first get its offspring terms, restricted to the background set l_all_ancestors. For a pair of terms in this offspring set, if both are also in nodes, the ancestor an is a common ancestor of the two terms. Then we compare the depth of an to the depth previously saved for the pair, and update it to the depth of an if the latter is larger.
Similarly, we need to define functions to look for offspring, but this time with an additional constraint from the background set. The implementation is very similar to that of add_ancestors().
# node: a single integer scalar
# l_background: a logical vector with length n_all
offspring_within_background = function(node, l_background) {
l_all_offspring = rep(FALSE, n_all)
l_all_offspring[node] = TRUE # include the term itself
add_children(node, l_all_offspring, l_background)
return(l_all_offspring)
}
# node: a single integer scalar
# l_all_offspring: a logical vector with length n_all (a pointer)
# l_background: a logical vector with length n_all
add_children = function(node, l_all_offspring, l_background) {
children = lt_children[[node]] # a vector of integer indices
if(length(children) > 0) {
for(ch in children) {
if(l_background[ch]) { # only continue if the child is in the background
l_all_offspring[ch] = TRUE
add_children(ch, l_all_offspring, l_background) # recurse on each child
}
}
}
}
We pre-compute two variables: l_nodes, which corresponds to nodes but is a logical vector where the ith element is TRUE if term i is in nodes, and all_depth, the depth of all terms in the ontology.
LCA_depth is the matrix that stores the depth of LCA terms. The initial values are set to zero.
n = length(nodes)
node_pos = integer(n_all) # maps a term index to its position in `nodes`
node_pos[nodes] = seq_len(n)
LCA_depth = matrix(0, n, n)
diag(LCA_depth) = all_depth[nodes]
for(an in all_ancestors) {
l_offspring = offspring_within_background(an, l_all_ancestors)
offspring = which(l_offspring)
no = length(offspring)
if(no < 2) next
for(i in 1:(no-1)) {
if(l_nodes[ offspring[i] ]) {
for(j in (i+1):no) {
if(l_nodes[ offspring[j] ]) {
ii = node_pos[ offspring[i] ] # position of offspring[i] in `nodes`
jj = node_pos[ offspring[j] ]
if(LCA_depth[ii, jj] < all_depth[an]) {
LCA_depth[ii, jj] = all_depth[an]
LCA_depth[jj, ii] = LCA_depth[ii, jj]
}
}
}
}
}
}
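To make the two steps concrete, here is a self-contained R sketch of the whole procedure on a hypothetical 5-term DAG (1 → 2, 3; 2 → 4; 3 → 4, 5). It uses iterative stacks instead of the recursive pointer-style pseudo code above, since plain R functions cannot modify a caller's vector in place. For the query terms 4 and 5, the LCA is term 3, at depth 1.

```r
# Toy 5-term DAG, purely for illustration
n_all = 5
lt_parents  = list(integer(0), 1L, 1L, c(2L, 3L), 3L)
lt_children = list(c(2L, 3L), 4L, c(4L, 5L), integer(0), integer(0))

# depth = longest distance from the root, computed by simple relaxation
all_depth = rep(0L, n_all)
repeat {
  changed = FALSE
  for(i in seq_len(n_all)) {
    for(p in lt_parents[[i]]) {
      if(all_depth[i] < all_depth[p] + 1L) {
        all_depth[i] = all_depth[p] + 1L
        changed = TRUE
      }
    }
  }
  if(!changed) break
}

# Step 1: union of ancestors of a group of terms (iterative, with a stack)
ancestors_of_a_group = function(nodes) {
  l = rep(FALSE, n_all)
  stack = nodes
  while(length(stack) > 0) {
    v = stack[1]; stack = stack[-1]
    if(!l[v]) {
      l[v] = TRUE  # includes the query terms themselves
      stack = c(stack, lt_parents[[v]])
    }
  }
  l
}

# offspring of `node`, restricted to the background set
offspring_within_background = function(node, l_background) {
  l = rep(FALSE, n_all)
  l[node] = TRUE
  stack = lt_children[[node]]
  while(length(stack) > 0) {
    v = stack[1]; stack = stack[-1]
    if(l_background[v] && !l[v]) {
      l[v] = TRUE
      stack = c(stack, lt_children[[v]])
    }
  }
  l
}

nodes = c(4L, 5L)
n = length(nodes)
l_nodes = rep(FALSE, n_all); l_nodes[nodes] = TRUE
node_pos = integer(n_all); node_pos[nodes] = seq_len(n)

l_all_ancestors = ancestors_of_a_group(nodes)
all_ancestors = which(l_all_ancestors)

# Step 2: top-down update of the LCA depth matrix
LCA_depth = matrix(0L, n, n)
diag(LCA_depth) = all_depth[nodes]
for(an in all_ancestors) {
  offspring = which(offspring_within_background(an, l_all_ancestors))
  hits = offspring[l_nodes[offspring]]  # offspring that are query terms
  if(length(hits) < 2) next
  for(i in 1:(length(hits) - 1)) {
    for(j in (i + 1):length(hits)) {
      ii = node_pos[hits[i]]; jj = node_pos[hits[j]]
      if(LCA_depth[ii, jj] < all_depth[an]) {
        LCA_depth[ii, jj] = LCA_depth[jj, ii] = all_depth[an]
      }
    }
  }
}
LCA_depth  # the LCA of terms 4 and 5 is term 3, at depth 1
```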
In this three-level for-loop, for every triple (an, offspring[i], offspring[j]), an is always a common ancestor of offspring[i] and offspring[j].
We can give an approximate estimation of the time complexity of the algorithm. Denote the set of query terms as \(A\) and the union of the ancestors of all terms in \(A\) as \(C\). Let \(a_1\) and \(a_2\) be two terms in \(A\) and \(c\) a common ancestor of \(a_1\) and \(a_2\). The process involves visiting the triple \((c, a_1, a_2)\). In the algorithm we propose, the relations in a triple are visited only once, while in some other tools, some relations are visited multiple times.
Denote the number of ancestors as \(N_\mathrm{an}\) and the number of query terms as \(n\); then the approximate time complexity is \(O(N_\mathrm{an} \cdot n^2)\).
Since ancestors can be shared, \(N_\mathrm{an} < n \cdot d\), where \(d\) is the average depth of the \(n\) query terms. When \(n\) is relatively large, or more ancestors are shared, \(N_\mathrm{an} \ll n \cdot d\); in this case, the time complexity lies between \(O(n^2)\) and \(O(n^3)\), closer to \(O(n^2)\).
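For intuition about this bound, consider a complete binary tree where all leaf terms at depth \(d = 11\) are queried; the union of their ancestors is the whole tree, far fewer terms than the bound \(n \cdot d\):

```r
d = 11                # depth of the leaf terms
n = 2^d               # number of leaf terms, all of them queried
N_an = 2^(d + 1) - 1  # union of their ancestors = every term in the tree
c(N_an = N_an, upper_bound = n * d)  # 4095 vs 22528
```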
We can validate this with two DAGs. The first one, binary_tree, is a tree where each term has two children and all leaf terms have the same depth of 11.
lt = list()
binary_tree = dag_random(n_children = 2, tree = TRUE, depth = 11)
lt[[1]] = benchmark_runtime(binary_tree)
In the second DAG, each term has a number of children ranging between 3 and 10. Additionally, with a probability of 0.8, a term connects to other terms lower in the DAG, with the number of connected terms ranging between 1 and 15. The maximal number of terms in the DAG is set to 4000.
set.seed(37)
random_dag = dag_random(n_children = c(3, 10), max = 4000, p = 0.8, power = -1, np = c(1, 15))
lt[[2]] = benchmark_runtime(random_dag)
binary_tree
## An ontology_DAG object:
## Source: Ontology
## 4095 terms / 4094 relations / a tree
## Root: 1
## Terms: 1, 10, 100, 1000, ...
## Max depth: 11
## Aspect ratio: 186.18:1
random_dag
## An ontology_DAG object:
## Source: Ontology
## 3998 terms / 20038 relations
## Root: 1
## Terms: 1, 10, 100, 1000, ...
## Max depth: 5
## Avg number of parents: 5.01
## Aspect ratio: 436.8:1 (based on the longest distance from root)
## 558.6:1 (based on the shortest distance from root)
In binary_tree, the average number of parents is 1 (excluding the root), while in random_dag the average number of parents is higher, i.e. the DAG is more densely connected:
mean(n_parents(random_dag))
## [1] 5.012006
Let’s visualize these two DAGs:
library(grid)
pushViewport(viewport(x = 0.25, width = 0.5))
dag_circular_viz(binary_tree, newpage = FALSE)
popViewport()
pushViewport(viewport(x = 0.75, width = 0.5))
dag_circular_viz(random_dag, edge_transparency = 0.96, newpage = FALSE)
popViewport()
Figure S2-11. Visualization of the binary tree and a random dense DAG.
We then compare the runtime performance on the two DAGs of different forms.
par(mfrow = c(1, 2))
plot(NULL, xlim = c(0, 4100), ylim = c(0, 11),
xlab = "Numbers of random terms", ylab = "runtime (sec)")
for(i in seq_along(lt)) {
x = lt[[i]]$k
y = lt[[i]]$t
lines(x, y, col = i + 1)
}
legend("topleft", lty = 1, col = 2:3, legend = c("binary_tree", "random_dag"))
plot(NULL, xlim = c(0, 1), ylim = c(0, 1),
xlab = "Numbers of random terms, scaled", ylab = "runtime, scaled")
for(i in seq_along(lt)) {
x = lt[[i]]$k
y = lt[[i]]$t
x = x/max(x)
y = y/max(y)
lines(x, y, col = i + 1)
}
curve(x^1, from = 0, to = 1, lty = 2, add = TRUE)
curve(x^2, from = 0, to = 1, lty = 2, add = TRUE)
curve(x^3, from = 0, to = 1, lty = 2, add = TRUE)
legend("topleft", lty = 1, col = 2:3, legend = c("binary_tree", "random_dag"))
Figure S2-12. Runtime performance on the binary tree and the random dense DAG. Left: absolute runtime. Right: scaled runtime.
Last, the code inside the second if block of the pseudo code can be generalized, e.g. for finding the MICA or for calculating pairwise distances in the ontology; thus this algorithm is a general algorithm for dealing with relations of term pairs on a DAG.
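One way to picture this generalization (a hypothetical sketch of the idea, not simona's internal API): only the per-pair update changes between tasks, parameterized by the statistic of the common ancestor and the comparison used.

```r
# score: the n x n result matrix; ii, jj: positions of the two query terms;
# value: the statistic of the current common ancestor `an`, e.g. its depth
# (LCA), its information content (MICA), or a negated path length (distance)
update_pair = function(score, ii, jj, value, better = `>`) {
  if(better(value, score[ii, jj])) {
    score[ii, jj] = score[jj, ii] = value
  }
  score
}
m = matrix(0, 2, 2)
m = update_pair(m, 1, 2, 5)  # a deeper ancestor is found, update the pair
m = update_pair(m, 1, 2, 3)  # a shallower one, keep the previous value
```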
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.2.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] C/UTF-8/C/C/C/C
##
## time zone: Europe/Berlin
## tzcode source: internal
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] GetoptLong_1.0.5 simona_0.99.12 knitr_1.44 rmarkdown_2.25
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.7 bitops_1.0-7 xml2_1.3.5
## [4] shape_1.4.6 RSQLite_2.3.1 digest_0.6.33
## [7] magrittr_2.0.3 evaluate_0.22 RColorBrewer_1.1-3
## [10] iterators_1.0.14 blob_1.2.4 GO.db_3.17.0
## [13] circlize_0.4.15 fastmap_1.1.1 foreach_1.5.2
## [16] doParallel_1.0.17 jsonlite_1.8.7 AnnotationDbi_1.62.2
## [19] GenomeInfoDb_1.36.4 DBI_1.1.3 GlobalOptions_0.1.2
## [22] httr_1.4.7 ComplexHeatmap_2.16.0 Biostrings_2.68.1
## [25] codetools_0.2-19 jquerylib_0.1.4 cli_3.6.1
## [28] rlang_1.1.1 crayon_1.5.2 XVector_0.40.0
## [31] scatterplot3d_0.3-44 Biobase_2.60.0 bit64_4.0.5
## [34] cachem_1.0.8 yaml_2.3.7 tools_4.3.1
## [37] parallel_4.3.1 memoise_2.0.1 colorspace_2.1-0
## [40] GenomeInfoDbData_1.2.10 BiocGenerics_0.46.0 vctrs_0.6.4
## [43] R6_2.5.1 png_0.1-8 matrixStats_1.0.0
## [46] stats4_4.3.1 zlibbioc_1.46.0 KEGGREST_1.40.1
## [49] S4Vectors_0.38.2 IRanges_2.34.1 bit_4.0.5
## [52] clue_0.3-65 cluster_2.1.4 pkgconfig_2.0.3
## [55] bslib_0.5.1 Rcpp_1.0.11 xfun_0.40
## [58] rjson_0.2.21 htmltools_0.5.6.1 igraph_1.5.1
## [61] compiler_4.3.1 Polychrome_1.5.1 gifski_1.12.0-2
## [64] RCurl_1.98-1.12